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Abstract 

Although some long noncoding RNAs (IncRNAs) have been shown to 
regulate gene expression in cis, it remains unclear whether IncRNAs 
can directly regulate transcription in trans by interacting with 
chromatin genome-wide independently of their sites of synthesis. 
Here, we describe the genomically local and more distal functions 
of Paupar, a vertebrate-conserved and central nervous system- 
expressed IncRNA transcribed from a locus upstream of the gene 
encoding the PAX6 transcription factor. Knocl<down of Paupar 
disrupts the normal cell cycle profile of neuroblastoma cells 
and induces neural differentiation. Paupar acts in a transcript- 
dependent manner both locally, to regulate Pax6, as well as distally 
by binding and regulating genes on multiple chromosomes, in part 
through physical association with PAX6 protein. Paupar binding sites 
are enriched near promoters and can function as transcriptional 
regulatory elements whose activity is modulated by Paupar tran- 
script levels. Our findings demonstrate that a IncRNA can function in 
trans at transcriptional regulatory elements distinct from its site of 
synthesis to control large-scale transcriptional programmes. 
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Introduction 

The resolution of two key questions would greatly improve our 
understanding of the functions of long noncoding RNAs (IncRNAs; 
>200 nucleotides]. First, is it more often the RNA product or 
else the act of transcription that conveys IncRNA function? Sec- 
ond, is any given IncRNA more likely to control transcription 
locally [in the vicinity of its locus) or else more distally in the 
genome? 



A number of IncRNAs have been shown to regulate transcription 
of neighbouring genes on the same chromosome in an apparent 
c!s-acting mechanism (Lai et al, 2013; Melo et al, 2013; Wang et al, 
2011). These IncRNAs appear to function near their site of synthesis, 
in either an RNA-dependent manner to mediate looping onto the 
promoter regions of their transcriptional targets, or by using 
RNA-independent mechanisms to locally alter chromatin status. By 
contrast, IncRNA transcripts have also been proposed to regulate 
gene expression in trans, without influencing transcription of their 
genomically neighbouring genes (Guttman et al, 2011; Hung et al, 
2011). Trans-acting IncRNAs include p53-induced IncRNAs involved 
in mediating the DNA damage response (Huarte et al, 2010; Hung 
et al, 2011), IncRNAs transcribed from within the promoters of cell 
cycle genes (Hung et al, 2011), IncRNAs that function in the control 
of pluripotency and lineage differentiation (Guttman et al, 2011) 
and those that are regulators of dosage compensation (Chu et al, 
2011; Simon et al, 2011). Other examples include Evf-2 which binds 
and modulates the activity of the homeodomain containing tran- 
scription factor Dlx2 (Feng et al, 2006), and Hotair, a IncRNA tran- 
scribed from the HoxC locus, which regulates the activity of HoxD 
cluster genes in trans and interacts with chromatin at over 800 
regions genome-wide (Chu et al, 2011; Rinn et al, 2007). LncRNAs 
therefore have the potential to interact with chromatin and 
specifically target multiple different loci genome-wide. 

LncRNA loci that are transcribed in the developing mouse central 
nervous system (CNS) show a preference to be located adjacent to tran- 
scription factor genes and thus may regulate their transcription (Ponjavic 
et al, 2009) . Here, we investigate the transcriptional function of a CNS 
expressed, imspliced, and chromatin-associated intergenic IncRNA 
termed Paupar that is divergently transcribed 8.5 kb upstream of Pax6. 
Paupar was prioritised for detailed experimental investigation from 
among those we catalogued previously (Ponjavic et al, 2009) owing to 
the atypical evolutionary conservation of its sequence and transcription 
and because of its physical proximity to the transcription factor Pax6. 

Pax6 is required for eye and diencephalon specification and is 
known to control progenitor cell potency, progenitor cell proliferation, 
neuronal cell sub-type specification and spatial patterning in a 



1 IVIRC Functional Genomics Unit, University of Oxford, Oxford UK 

2 CGAT University of Oxford, Oxford, UK 

3 Department of Physiology, Anatomy and Genetics, University of Oxford, Oxford, UK 

4 Department of Biochemistry, University of Oxford, Oxford, UK 

♦Corresponding author Tel: +44 1855 282554; Fax: +44 1865 282849; E-mail: Keith.Vance@dpag.ox.ac.uk 
"Corresponding author Tel: +44 1865 282690; Fax: +44 1865 285852; E-mail: Chris.Pontingpdpag.ox.ac.uk 
^These authors contributed equally to this work. 



296 



The EMBO Journal Vol 33 | No 4 | 2014 



© 2014 The AuthorsThis is an open access article under the terms of the Creative Commons Attribution License, 
which permits use, distribution and reproduction in any medium, provided the original work is properly cited. 



Keith W Vance et al Paupar, a trans-acting IncRNA 



The EMBO Journal 



dosage-sensitive manner (Georgala et al, 2011; Hill et al, 1991; 
Sansom et al, 2009; Shaham et al, 2012). Heterozygous human 
PAX6 mutations can result in aniridia and in a variety of structural 
brain abnormalities that closely resemble those seen in Small eye 
[sey] mice heterozygous for Pax6 mutations (Georgala et al, 2011; 
Hingorani et al, 2012). The proximity of the Paupar gene to Pax6 
suggested to us that it may be involved in the spatiotemporal control 
of Pax6 expression and hence that it may be important for nervous 
system development and neurological disease. 

Our results demonstrate functions for Paupar in the control of 
growth and differentiation in neural cells. In addition to conveying 
these functions locally, via transcriptional regulation of Pax6, we 
unexpectedly discovered that Paupar also functions distally in trans 
to control neural gene expression on a large scale. We mapped 
genome-wide Paupar occupancy in N2A neuroblastoma cells and 
identified hundreds of genes that are both bound and regulated by 
Paupar. We discovered that the Paupar transcript physically associ- 
ates with PAX6 protein and that Paupar and PAX6 co-occupy 
specific genomic binding sites. Our results also revealed that Paupar 
associates in trans with functional elements involved in transcrip- 
tional control and that the Paupar transcript can modulate these 
elements' activity. Our data therefore demonstrate that a single 
IncRNA transcript can bind and regulate the activity of multiple 
transcriptional regulatory elements on different chromosomes 
distinct from its site of synthesis. 

Results 

Conserved Paupar genomic organisation and transcription 

RNA polymerase Il-transcribed, CNS-expressed mouse IncRNAs tend 
to be evoludonarily constrained and to be preferentially located 
adjacent to transcriptional regulatory genes in the genome (Ponjavic 
et al, 2009, 2007). One of these IncRNAs (GenBank: AK032637), 
which we term Paupar [Pax6 Upstream Antisense RNA) , is a single 
exon IncRNA transcribed from 8.5 kb upstream of the Pax6 gene in 
mouse which lies entirely within the first intron of PaxSosl, a 
previously defined non-coding Pax6 natural antisense transcript 
locus (Fig lA; Alfano et al, 2005) . Rapid amphfication of cDNA 
ends (RACE) experiments in mouse neuroblastoma cells extended 
AK032637 by approximately 700 bp at the 5' end revealing mouse 
Paupar to be a 3.48 kb transcript (Fig IB). The Paupar locus 
contains two regions of high DNA sequence conservation across 
diverse vertebrates that unusually include fish and birds as well as 
mammals (Fig IB). The first of these regions lies just 5' upstream of 
the Paupar transcriptional start site and is likely to contain this 
transcript's promoter sequence. The second lies within the 
transcribed sequence and encompasses both a previously identified 
Pax6 neuroretina enhancer element (Plaza et al, 1999) and a region 
of the transcript that we predicted to contain a stem loop secondary 
structure (Ponjavic et al, 2009). The orthologous human transcript, 
transcribed from 8.6 kb upstream of the human PAX6 gene, was 
identified in foetal brain using RT-PCR and RACE and shows three 
regions of high sequence identity to its mouse orthologue (Fig IC). 
Paupar transcripts are known from dog, as well as from more 
distantly related vertebrates, frog and zebrafish (Fig IC). Paupar 
therefore is unusual in exhibiting higher degrees of sequence and 



transcriptional conservation than most IncRNA loci (Cabili et al, 
2011; Marques & Pouting, 2009; Uhtsky et al, 2011). 

Paupar transcript is chromatin associated and co-expressed witli 
Pax6 in tlie neural lineages 

To begin our investigation of Paupar function we first characterised 
its expression profile and sub-cellular localisation. We found that 
mouse Paupar is most highly expressed in the adult brain (Fig 1 D) and 
shows a clear correspondence in expression profile with Pax6 (Fig IE) . 
Notably, Paupar is expressed in the developing cerebellum in both the 
internal and external granular layers, where Pax6 is also strongly 
expressed (Supplementary Fig SIA). Given the apparent spatial co- 
expression of Paupar and Pax6, we then asked whether their expres- 
sion is temporally coordinated during retinoic acid (RA) -induced 
differentiation of mouse E14 embryonic stem (ES) cells. While Paupar 
expression is undetectable in self-renewing ES cells, it is rapidly and 
transiently up-regulated after 1 day of RA treatment before increasing 
again at 4 days (Fig IF), a profile similar to that observed for Pax6 
(Fig IG). Mouse neuro 2 A (N2A) neuroblastoma cells express both 
Paupar (at an average level of approximately 15 copies per cell [Sup- 
plementary Fig SIB]) and Pax6, but not PaxGOSl (Supplementary Fig 
SIC), and are widely used as an in vitro model of neuronal differentia- 
tion. In these cells, we found Paupar RNA (Fig IH), but not a control 
mRNA [Tbp; Fig II), to be nuclear-enriched and located mainly in the 
chromatin fraction, and noted that ENCODE data show human Paupar 
to be similarly present in the nucleus and chromatin (ENCODE Project 
Consortium, 2012). Together, these data suggest that Paupar may act 
locally to regulate Pax6 expression or that it may regulate similar 
biological processes as Pax6. 

Paupar regulates neural gene expression 

To investigate the functional importance of the Paupar transcript we 
performed transcriptome profiling of Paupar knockdown in N2A cells. 
We reduced Paupar expression by approximately 52 % , using transient 
transfection of a Paizpar-targeting shRNA expression vector (Fig 2A), 
and verified that the chromatin-associated fraction of Paupar is 
depleted using this approach (Supplementary Fig S2A) . Paupar knock- 
down resulted in statistically significant changes in the expression 
levels of 942 genes (False Discovery Rate [FDR] < 5%) compared to a 
non-targeting control (Supplementary Table 2); 654 (69%) of these 
genes were down-regulated and 288 (31 %) were up-regulated (Fig 2B). 

Paupar-regulated genes are significantly enriched in those 
involved in cell cycle control, specifically DNA replication and mito- 
sis, those playing a role in synaptic function, and those modifying 
chromatin and chromosome organisation (Fig 2C, Supplementary 
Table 3). To validate the changes in expression observed from the 
microarrays, we performed qRT-PCR for 12 Paupar-regulated genes 
with two additional Paupar-targeting shRNA expression constructs 
(Supplementary Fig S2B). We observed consistent changes for all 12 
genes and saw changes in expression commensirrate with the strength 
of Paupar knockdown indicating that transcript level changes are 
specific and are not likely to result from off-target effects. Further- 
more, Paupar overexpression induced dose-dependent changes in the 
expression of six out of eleven Paupar-regulated genes tested (Supple- 
mentary Fig S2C) . The Paupar transcript therefore appears to function 
as a large-scale regulator of gene expression in neural cells. 
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Paupar regulates neural growth and differentiation in N2A cells 

We next investigated the lole of Paupar in cell cycle control by 
assaying the effect of Paupar knock-down on the proliferation of 
N2A cells. Paupar knockdown cells accumulate in S and G2 phases 
(Fig 2D] indicating that this transcript is necessary for normal 
passage through S phase and entry into mitosis. Taken together with 
the temporally regulated expression of Paupar during neural differ- 
entiation, these data indicate that Paupar may be important for the 
control of neural progenitor cell proliferation and differentiation. 

To further investigate this hypothesis we generated stable Paupar 
loss-of-function N2A cell lines and analysed the role of the Paupar 
transcript in neural differentiation. We isolated and expanded two 
independent clones in which Paupar levels had been reduced by 
50-60% (Fig 2E]. Strikingly, the Paupar knockdown clones showed a 
clear increase in neurite outgrowth, a well-characterised feature 
of neuronal differentiation, compared to a non-targeting control 
line (Fig 2F and G) . Additionally, the Paupar knockdown clones also 
showed increased levels of the neuronal differentiation marker Tubb3 
(encoding tubulin beta-3 chain) and a moderate reduction in Pax6 
which is known to be down-regulated upon neuronal differentiation 
(Fig 2H and 1} . Together, these results indicate that Paupar regulates 
gene expression programmes that control neural growth and differen- 
tiation, acting to maintain the self-renewal of N2A cells. 

Paupar and Pax6 have both common and distinct 
transcriptional targets 

Given the known roles of Pax6 in controlling neural stem cell fate, 
we next sought to further investigate the effect of Paupar RNA 
reduction on the expression of Pax6. While we observed a small 
decrease in Pax6 transcript levels following stable Paupar knock- 
down (Fig 21), this finding cannot be interpreted unambiguously 
given that neural progenitor cells can down-regulate Pax6 as they 
become neurogenic (Hsieh cSi Yang, 2009) and that Pax6 is known to 
auto-regulate its own expression (Aota et al, 2003; Manuel et al, 
2007). We therefore reduced Paupar levels with two distinct 
shRNAs transfected into N2A cells (Fig 3A) and used qRT-PCR to 
measure acutely induced changes in Pax6 expression. Transient 
reduction in Paupar RNA levels up-regulated Pax6 in a dose- 
dependent manner: a maximum 54% reduction in Paupar levels 



resulted in an 80% increase in Pax6 expression (Fig 3A). These 
observations agreed with a small (1.2-fold), yet genome-wide 
non-significant, increase in Pax6 expression detected on the Paupar 
knock-down microarrays. 

Given the abihty of Paupar to regulate Pax6 expression, we 
sought to determine the extent to which this could explain the 
Paupar transcriptional response. Reduction of PAX6 protein levels 
in N2A cells by approximately 70%, through the transient trans- 
fection of a Paxe-targeting shRNA expression vector (Fig 3B), 
resulted in statistically significant expression level changes in 925 
genes (FDR < 10%; Fig 3C and Supplementary Table 2). Impor- 
tantly, we noted no change in the levels of Paupar transcript 
upon Pax6 knockdown (Supplementary Fig S3). Genes changing 
in expression, as expected from the role of PAX6 as a key neuro- 
developmental transcription factor, were enriched for genes 
involved in neurogenesis and transcription factor binding (Fig 3F, 
Supplementary Table 3). 

The set of genes showing significant expression changes in both 
Paupar and Pax6 knock-downs was 5.1-fold greater than expected 
by chance (P < 2.2 x 10"'^; Fig 3D), consistent with Paupar regu- 
lating Pax6 expression. A large majority of these genes showed 
positively correlated changes in expression for both Paupar and 
Pax6 knock-down (Fig 3E) indicating that while Paupar may 
repress Pax6 transcription, Paupar RNA and PAX6 protein cooper- 
ate to coordinate the expression of a common set of target genes. 
However, notwithstanding the significant overlap between genes 
regulated by Pax6 and Paupar, a large majority of Paupar respon- 
sive genes are not significantly altered by Pax6 knockdown 
suggesting that Paupar may also possess Pax6-independent trans- 
acting functions. Notably, genes regulated by Paupar but not by 
Pax6 are enriched for regulators of cell cycle control and chroma- 
tin organisation, while genes whose expression are controlled by 
both Paupar and Pax6 include many with synaptic functions 
(Fig 3F, Supplementary Table 3). 

Genome-wide binding profile of the Paupar IncRNA in N2A cells 

We next investigated whether Paupar might function as a trans- 
acting transcriptional regulator by binding to genomic locations 
distal to its own locus. We first mapped the genome-wide binding 
profile of Paupar using the recently described Capture Hybridisation 



Figure 1. Conservation and expression of Paupar. 

A Schematic illustration of the mouse Pax6 genomic territory displaying coding and non-coding transcript structures (NCBI37/mm9). 

B A detailed view of the mouse Paupar locus (red) indicating regions of vertebrate DNA sequence conservation and the location of sequence (blue) that, in human 

and quail, is a Pax5 neuroretina enhancer (Plaza et al, 1999). 
C Conservation and relative sizes of identified Paupar transcripts in vertebrates. For human and mouse Paupar, transcript start sites (arrows) and transcript ends 

were confirmed by RACE (primer sequences in Supplementary Table 1). The identified orthologous ESTs from dog (DN871729), frog (CX414799, DN054151 and 

DN054152), and zebrafish (CT684153 and CT584154) are unlikely to represent full-length transcripts. Each of these Paupar orthologues displays conserved genomic 

location and transcriptional orientation relative to Pax6. 
D, E Paupar is a brain-expressed IncRNA Paupar (D) and Pax6 (E) expression levels were measured across a panel of adult mouse tissues using quantitative RT-PCR (qRT-PCR). 

Results are presented relative to the average value of Capdh and Tbp reference genes. Mean values ± standard error (s.e.) shown, n = 3 replicates. 
F, G Similarly to Pax5, Paupar is up-regulated during neuronal differentiation of mouse ES cells. Neuronal differentiation of mouse ES cells was induced using RA. We 

determined the levels of Paupar (F) and Pax6 (G) using qRT-PCR. Results are expressed relative to an Idhl control which does not change significantly during 

differentiation. Mean ± s.e., n = 3. 

H, I Paupar is a chromatin-associated transcript that functions to regulate Pax5 expression. N2A cells were biochemically separated into cytoplasmic, nucleoplasm, 
420 mM salt and chromatin fractions. The relative levels of Paupar (H) and a control mRNA {Tbp) (I) in each fraction were determined by qRT-PCR. Mean 
values ± s.e. of three independent experiments. RT, reverse transcriptase. 
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Figure 2. Paupar functions to regulate genes involved in cell cycle control and synaptic function. 

A N2A cells were transfected with eittner a non-targeting control or a Poupar-targeting shRNA expression vector {sh408) and Paupar levels were determined by 
qRT-PCR 3 days later. 

B Paupar knockdown induces statistically significant changes in the expression of 942 genes in N2A cells (5% FDR; Supplementary Table 2). 
C Significant Gene Ontology annotation enrichments of Poupor-regulated genes (5% FDR, Supplementary Table 3). 

D Paupar is important for normal S-phase progression and entry into mitosis. Mouse N2A cells were transfected with either a control or a Poupor-targeting shRNA 

expression vector. Three days later cells were fixed, stained with propidium iodide and the DNA content measured using flow cytometry. 
E Paupar loss-of-function cell lines were generated by stable co-transfection of shRNA expression plasmids against either Paupar or a non-targeting control and a 

hygromycin expression vector for selection. qRT-PCR analysis confirms the generation of two clonal cell lines expressing reduced levels of Paupar. Mean values ± s.e. 
F Paupar knockdown cells display increased neurite outgrowth. Control and Paupar knockdown cells were imaged using bright field microscopy. Scale bar, 50 ^lm. 
G Quantification of neurite outgrowth. Cells with one or more neurites of length greater than twice the cell body diameter were scored as positive. Average 

values ± s.e., n = 3. A total of 100-200 cells were counted in each case. 
H, I The relative levels of the neuronal differentiation marker TubbS (H) and Pax5 (I) were quantified in Paupar knockdown and control cells using qRT-PCR. Samples 

were normalised using Capdh and are presented relative to expression in control cells (set arbitrarily to 1). Mean values ± s.e., n = 3. 



Analysis of RNA Targets (CHART)-Seq technique (Simon, 2013; 
Simon et al, 2011) in N2A cells. This approach uses anti-sense 
oligonucleotides to purify target IncRNAs and their associated 



chromatin complexes and thus identifies both direct and indirect 
genomic associations. We mapped accessible regions of the Paupar 
transcript based on RNase H sensitivity and designed four 
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Figure 3. Paupar has both PaxS-dependent and -independent functions in transcriptional regulation. 

A Paupar knockdown leads to an increase in Pax6 expression. N2A cells were transfected with two independent shRNA expression constructs targeting different regions 
of the Paupar transcript The levels of Paupar and the adjacent Pax6 gene were quantified using qRT-PCR 3 days later. Samples were normalised using Capdh and the 
results are presented relative to a non-targeting control (set at 1). Mean values ± s.e., n = 4, one-tailed t-test unequal variance. 

B Cells were transfected with either a non-targeting control or a Pox5-targeting shRNA expression vector and PAX6 protein levels were analysed by Western blotting 
3 days later. Lamin Bl expression was used as a loading control. 

C Pax6 knockdown resulted in statistically significant changes in the expression of 925 genes in N2A cells (10% FDR, Supplementary Table 2). 

D Intersection of PoxS and Poupor targets reveals a significant (Fisher's exact test) overlap approximately five times as large as expected by chance alone. 

E Target genes for both Paupar and PaxS show correlated expression, with the majority being positively regulated by both factors. +, positive dependency; -, negative 
dependency. 

F Enrichments of Gene Ontology categories in Pax5 and Poupar target genes. 



biotinylated DNA oligonucleotides, complementary to these regions 
[Supplementary Fig S4]. For a control oligonucleotide, we used a 
sequence corresponding to Escherichia coli LacZ which is absent 
from the mouse genome. Paupar probes showed strong enrichment 
[17-fold) of the Paupar transcript compared to the LacZ control 
[Fig 4A) , and did not enrich for negative control transcripts, Malatl , 
a nuclear IncRNA, or Gapdh mRNA. As expected from physical 
association of a nascent transcript with its site of synthesis, 
we observed specific enrichment of Paupar at its DNA locus 
[Supplementary Fig S4) . 

Following the CHART-seq protocol, we used RNase H elution to 
recover genomic DNA associated with endogenous Paupar 
transcripts and genomic DNA associated with the control oligo- 
nucleotide, sequencing replicate samples using the lUumina HiSeq 
system. Using the paired-end peak caller MACS2 [Zhang et al, 2008), 
we identified Paupar binding sites in comparison both to DNA 



recovered using the control LacZ oligonucleotide and to input DNA 
from N2A cells. We discovered thousands of peaks across the gen- 
ome, for example at the transcriptional start site [TSS) of £2/2, and 
at sites upstream of Sox2 and downstream of Hesl [Fig 4B] and 
defined Paupar binding sites as those peaks found in comparison 
to both input DNA and the control oligonucleotide samples 
[Fig 4C). Paupar occupancy at nine candidate binding sites was 
validated using CHART-qPCR in two further independent experi- 
ments [Supplementary Fig S4]. 

These 2,849 candidate Paupar binding sites [Supplementary 
Table 4) are generally widely distributed across the genome, show a 
significant three-fold depletion on the X chromosome (P < 0.001 by 
genome-wide simulation accounting for mappability and GC biases 
[Heger et al [2013); Supplementary Table 5; Fig 4D), and occur 
preferentially within the promoters and 5' UTRs of protein-coding 
genes [Fig 4E) . Candidate Paupar binding sites range from narrower 
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Figure 4. CHART-Seq analysis of Paupar genomic binding sites. 

A Specific enrichment of Paupar RNA using oligonucleotides complementary to accessible regions of Paupar, as determined by RNase H mapping (see Supplementary 

Fig S4), compared to the LacZ control. Mean value ± s.e., n = 4. 
B Sequencing of Poupar-bound DNA (RNase H elution) reveals peaks of Paupar binding, including those at the promoter of E2f2, upstream of Sox2 and downstream of 

Hesl. 

C-E Peaks were called by comparing with sequences both from control CHART-seq experiments and from input DNA. Here we only consider the 2,849 peaks common to 
both comparisons (C, and Supplementary Table 4). Paupar peaks are broadly distributed across the mouse genome (D) but are particularly enriched in 5' UTRs 
and gene promoters (E). Red arrowheads in (D) indicate the position of the Paupar locus. Asterisks in (E) indicate significance at 5% FDR (Benjamini-Hochberg). 

F The width distribution of Paupar binding peaks. 

G Representative categories from Gene Ontology analysis of genes associated with Paupar binding sites reveal enrichments for stem cell and neuronal categories 
amongst others (Supplementary Table 5). 
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focal peaks of 200-1,000 bp, similar to those previously described 
for Hotair and Terc IncRNAs (Cliu et al, 2011), to broader genomic 
regions of up to 9.5 Icb (Fig 4F). 

We examined the sequence of the Paupar binding locations for 
clues as to its genomic targeting. Using a local alignment approach 
(see Materials and Methods), we did not find evidence for 
sequence complementarity between Paupar and its binding loca- 
tions. However, de novo motif discovery (Supplementary Fig S5) 
identified a motif closely resembling a known PAX6 DNA binding 
motif in 9.2% of the top 500 scoring Paupar binding locations 
(Supplementary Fig S5C). Further analysis of Paupar CHART-Seq 
peaks for the presence of known vertebrate transcription factor 
motifs revealed enrichment of motifs for several neural transcrip- 
tion factors (Supplementary Table 7). Together, these results 
suggest that Paupar is not targeted to the genome through direct 
RNA-DNA interactions but that instead it interacts with PAX6 and 
other neural transcription factors to target specific genomic regions 
in a context-dependent manner. 

Paupar binding sites are associated witli genes and regulatory 
regions of known stem cell and neural function 

Using Gene Ontology assignments (Ashburner et al, 2000), we next 
sought to characterise the set of genes associated with Paupar 
binding locations (Fig 4G, Supplementary Table 6). This analysis 
revealed enriched stem cell categories with peaks associated with 
many key stem cell genes such as Sox2, Nanog, Hesl, HesS, Rbpj 
and Lif. We also found Paupar peaks to be enriched for neuronal 
gene categories as well as in genes whose products are important 
for the epigenetic regulation of gene expression. Intriguingly, 
Paupar binding sites are also enriched in genes associated with 
insulin secretion, a process in which Pax6 has an estabUshed role 
(Gosmain et al, 2012). 

To gain insight into the mechanism by which Paupar regulates 
gene expression, we examined its binding in relation to known 
functional regulatory regions. Firstly, we found that the majority of 
Paupar binding sites overlap DNase I hypersensitivity (HS) sites and 
CIS-regulatory elements identified by the Mouse ENCODE Project 
(Shen et al, 2012; Fig 5A, Supplementary Table 4). A large subset of 
binding sites are significantly associated with neuronal DNase I HS 
sites and a further subset are associated with DNase I HS sites found 
in embryonic stem cells (Fig 5B and 5C). In agreement with the 
enrichment of Paupar binding sites at 5' UTRs and promoters, we 
also observed a set of binding sites associated with features charac- 
teristic of transcriptional start sites including Pol II, predicted CpG 
islands, and tri-methylation of histone H3 at lysine 4 (H3K4me3; 
Fig 5D) . Additionally, we saw significant overlaps with Ctcf-binding 
regulatory elements and tissue-specific enhancers, defined using 
ratios of H3K4 mono- and tri-methylation (Shen et al, 2012). 

A subset of Poupar-regulated genes is also associated with 
Paupar binding 

To investigate the functional consequences of Paupar binding we 
intersected the Paupar CHART-Seq peak-set with our microarray 
analysis of Paupar and Pax6-mediated gene expression changes. This 
identified 242 Paizpar-bound and -regulated genes (5% FDR), repre- 
senting likely direct transcriptional targets, and 254 Paupar-bound 



and Pax6-regulated genes (10% FDR; Fig 5E). Hierarchical cluster- 
ing of the 242 Paupar-bound and -regulated genes indicated that 
while Paupar predominantly activates the expression of target 
genes it can also function to repress certain loci (Fig 5F). In accor- 
dance with this, genes associated with Paupar binding sites 
showed significantly higher expression in N2A cells than other 
genes (Supplementary Fig S6A, P < 2 x lO"'^). 

Notably, genes regulated by both Paupar and Pax6 showed a 
small but significant enrichment for associated Paupar binding sites 
(Supplementary Fig S6B). To dissect the relationship between gene 
expression changes in the Paupar and Pax6 knockdown experiments 
and the association of genes with Paupar binding further, we plotted 
these observations (Fig 5G). First, we noted a moderate significant 
correlation (0.20, P < 2.2 x 10"'^) between all genes tested for 
differential expression in the two knockdown experiments, and that 
this correlation was stronger for genes also associated with Paupar 
binding sites (0.31, P < 2.2 x 10""'). Secondly, we noted a distinct 
group of genes that are significantly positively regulated by both 
Paupar (5% FDR) and Pax6 (10% FDR) and associated with Paupar 
binding (red circles, lower left Fig 5G). In line with the observed 
intersection between genes regulated by Paupar and Pax6 (Fig 3), 
these findings suggest that Paupar and Pax6 co-operate to regulate a 
set of common target genes (red circles, Fig 5G) and further that 
Paupar can act independently of Pax6 to directly regulate the 
expression of a separate set of genes (red triangles). 

Paupar and PAX6 co-occupy specific genomic binding sites 

The identification of 71 Paupar and Pax6 co-regulated genes that are 
also bound by Paupar in their regulatory regions (Fig 5E) together 
with the discovery of the known PAX6 DNA binding motif from 
Paupar bound sequences (Supplementary Fig S5) suggested a 
functional interaction between PAX6 and Paupar. To investigate 
this, we first tested for a physical association between PAX6 protein 
and the Paupar transcript using UV cross-linking RNA immuno- 
precipitation (UV-RIP) in N2A cells and detected a fourfold enrich- 
ment of Paupar using an anti-PAX6 antibody compared to an 
isotype control (Fig 5A) . The Paupar-P AX6 interaction appears to be 
specific as we observed no enrichment of a negative control Ul 
snRNA transcript. 

We next measured PAX6 occupancy at a set of Paupar binding 
sites within the regulatory regions of genes whose expression changes 
in Paupar knockdown cells (Fig 5B). Using chromatin immunopre- 
cipitation (ChIP) , we found a strong enrichment of PAX6 at Paupar 
binding sites within the regulatory regions of genes whose expression 
changes significantly upon both Pax6 and Paupar knockdown. By 
contrast, the majority of the assayed Paupar binding locations associ- 
ated with genes that changed in expression upon Paupar but not Pax6 
knockdown showed little enrichment for PAX6 (Fig 6B) . 

Several IncRNAs titrate DNA binding transcription factors away 
from their genomic targets (Hung et al, 2011; RapicavoU et al, 2013; 
Sun et al, 2013). We therefore examined PAX6 occupancy at a 
number of Paupar binding sites following Paupar knockdown and 
discovered that Paupar depletion does not significantly affect PAX6 
chromatin occupancy at the regions tested (Supplementary Fig S7). 
In a similar manner, the androgen receptor (AR) -associated 
IncRNAs Pmcrl and Pcgeml do not affect AR DNA binding but 
instead function to recruit transcriptional co-factors to DNA bound 
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Figure 5. Association of Paupar binding sites witli regulatory regions. 

A The majority of Paupar binding sites overlap witti DNase I HS sites and transcriptional regulatory elements as identified by the ENCODE project (selected tissues; 
Supplementary Table 5). 

B Hierarchical clustering of HS sites intersecting Paupar binding sites (light yellow) reveals large clusters of HS sites identified in neural tissues and embryonic stem 

cells that are bound by Paupar in N2A cells. 
C Fold enrichment of Poupor CHART-Seq-DNase I HS site associations. 

D Hierarchical clustering of c/s-regulatory elements and Paupar peaks show groups of Paupar peaks associated with promoter-like features (H3K4me3, Pol II binding, 

CpC island predictions), peaks associated with Ctcf and peaks associated with tissue-specific enhancer elements. 
E The intersection of genes proximal (< 1 Mb) to Paupar peaks and genes changing expression upon Paupar (5% FDR) and PoxS (10% FDR) knockdown. 
F Heatmap displaying expression changes in the 242 Paupar bound and regulated genes from (E). 

C Analysis of Paupar-associated genes and changes in expression from Paupar and Pax6 knockdown array experiments reveals a set of genes positively regulated by 
both Paupar and Pax6 and directly bound by Paupar (red circles, lower left). Correlations (cor) are significant at P < 2 x 10"^^. 

Data information: Asterisks (C, D) indicate significance at 5% FDR (Benjamini-Hochberg). 
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Figure 6. Paupar functions in trans to modulate thie activity of neurodevelopmental gene transcriptional regulatory elements. 

A Paupar interacts with PAX5 protein in N2A cells. Nuclear extracts were prepared from UV cross-linked cells and immuno-precipitated using either anti-PAX6 or 
control IgC antibodies. Associated RNAs were purified and the levels of Paupar and control Ul snRNA detected in each UV-RIP using qRT-PCR. Results are 
expressed as fold enrichment relative to an isotype IgG control antibody. 

B PAX6 and Paupar co-occupy a specific set of genomic binding sites. ChIP assays were performed in N2A cells using either an antibody against PAX5 or an isotype- 



specific control. The indicated DNA fragments were amplified using qPCR. Fold enrichment was calculated as 2 



'(IP/IgG). 



C, D Paupar binding sites act as transcriptional regulatory elements. N2A cells were transfected with the indicated reporter constructs in a luciferase assay. Luciferase 
activity was compared to that of the empty SV40 promoter construct. 

E-C Paupar transcript modulates the transcriptional activity of its binding sites in trans. Luciferase reporters were co-transfected into N2A cells together with either a 
non-targeting control or two independent Poupor targeting shRNA expression vectors. Paupar depletion was confirmed using qRT-PCR. For these reporter assays, a 
Renilla expression vector was used as a transfection control and the total amount of DNA transfected in each case was made equal. 

Data information: Results are presented as mean values ± s.e., n = 3 (A-D) or n = 4 (E-C); ***P < 0.001, **P < 0.01, *P < 0.05, one-tailed t-test, unequal variance. 



AR (Yang et al, 2013). Taken together, these findings indicate that 
PAX6 and Paupar specifically co-occupy a subset of Paupar binding 
sites associated with genes whose expression change upon Pax6 and 
Paupar knockdown, and that PAX6 protein plays a role in targeting 
Paupar to the genome. 

Paupar modulates in trans the activity of transcriptional 
regulatory elements of neuro-developmental genes 

We next tested whether Paupar CHART-Seq peaks function as tran- 
scriptional regulatory elements. To do this, we first selected 11 
Paupar binding sites, ranging from 170 bp to 2.5 kb in length, 
located within the regulatory regions of genes controlling neural 



growth and differentiation (£2/2, E2f7 [2 peaks], Cdc6, CdknZc, 
Kdm7a [2 peaks], Soxl,Sox2,Hoxal,Hesl] and cloned each upstream 
of a heterologous SV40 promoter in a pGL3 luciferase vector. All of 
these genes, except Sox2 whose expression is undetectable in N2A 
cells, showed evidence of being regulated by Paupar using microarray 
profiling. The transcriptional activity of these constructs was then 
compared to that of the SV40 promoter alone following transient 
transfection in N2A cells. One (E2/2BS] out of six focal peaks of 
< 600 bp in length strongly activated (3.9-fold) the SV40 promoter, a 
further three displayed a small, but significant (1.3-1.7-fold), increase 
in SV40 promoter activity (Fig 6C) , while four out of five broad peaks 
> 1 kbp (Fig 6D) reproducibly repressed the SV40 promoter (1.6-1.8- 
fold}. Consistent with the observed enrichment of Paupar 
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CHART-Seq peaks at neuronal DNase I HS sites and ris-regulatory 
elements (Fig 5] these results indicate tliat Paupar binding sites 
function in transcriptional control and can operate as both transcrip- 
tional enhancers and repressors. 

Finally, we examined whether the Paupar transcript plays a 
trans-acting role in modulating the transcriptional response of its 
genomic binding sites. For this we chose to investigate the 
dependence on the Paupar transcript of the transcriptional activity 
of the strong £2/2 enhancer element as well as the weak E2f7 acti- 
vating region and the repressor elements associated with the Sox2, 
Hesl and Hoxal genes. To achieve this, reporters were co-transfected 
into N2A cells along with either a non-targeting control or two inde- 
pendent Paupar-targeting shRNA expression vectors in a luciferase 
assay. Strikingly, depletion of the endogenous Paupar transcript led 
to a dose-dependent increase in the enhancer activity of the 
SV40-£2f2BS reporter (Fig 6E and F). In accordance with this, £2/2 
displayed one of the greatest expression changes (1.9-fold up-regulation; 
Supplementary Table 2] in our transcriptome profiling of Paupar 
knockdown cells. The activity of the SV40-£2/'7BS reporter was not 
altered by Paupar loss of function in this assay. Furthermore, 
Paupar depletion suppressed the ability of both the SV40-So:ic2BS 
and SV40-HesJBS reporters to function as transcriptional repressors, 
in a manner dependent on the levels of the Paupar transcript, while 
the repressive function of the SV40-HoxaJBS reporter was not 
altered (Fig 5G). These findings demonstrate a transcriptional regu- 
latory function for the Paupar transcript at its genomic binding sites 
in trans and reveal that a IncRNA can function at transcriptional 
regulatory regions on different chromosomes distinct from its site of 
synthesis to control target gene expression. 

Discussion 

The extent by which IncRNAs contribute to genome function 
remains unclear. Here, we have used shRNA mediated knockdown, 
microarray profihng, genome-wide mapping and reporter assays to 
detail the RNA-dependent mode of action of Paupar, a chromatin- 
associated CNS expressed IncRNA transcript, in the regulation of 
gene transcription. Our findings that the Paupar transcript acts both 
locally, to regulate Pax6 expression, and distally in a transcript- and 
Paz6-dependent manner, reveal that IncRNAs can be functionally 
complex and will not always fall exclusively into cis- or trans-acting 
categories. Our data indicate that the Paupar transcriptional 
response is driven by at least three contributing components: down- 
stream gene expression changes arising from Paupar regulation of 
Pax6 expression; Pax6- and Paupar-dependent trans-acting gene 
expression changes likely mediated through a physical interaction 
between Paupar and PAX6 protein at Paupar associated loci; and 
trans-acting Pax6-independent Paupar functions at many bound 
transcriptional regulatory regions genome-wide. 

Consistent with a role for Paupar in regulating Pax6 expression, 
the Paupar locus has characteristics of a transcriptional enhancer. It 
spans both a previously defined Pax6 neuroretina specific enhancer 
conserved between human and quail (Plaza et al, 1999} and a 
known N2A cell DNase I HS site (McBride et al, 2011). Furthermore, 
ENCODE data indicate its locus to be marked by a high ratio of 
histone H3K4mel compared to H3K4me3, with high levels of 
H3K27me3 and an absence of H3K27ac in the mouse E14.5 brain. 



Different classes of enhancer-associated cis-acting ncRNAs have 
been described to date. Enhancer RNAs (eRNAs) are relatively short 
polymerase Il-transcribed, predominantly non-polyadenylated, 
bidirectional transcripts, first identified at neuronal enhancers (Kim 
et al, 2010} while enhancer-like IncRNAs are strand-specific, poly- 
adenylated transcripts (De Santa et al, 2010; Orom et al, 2010} 
which in these respects are more similar to Paupar. The chromatin 
signature of the Paupar locus, along with the finding that Paupar 
regulates Pax6 expression, implies that Paupar may play an 
important role in nervous system development. Furthermore, the 
widespread effect on over 900 genes when Paupar transcript levels 
were reduced by 52% may be associated with the haploinsufficiency 
and dosage-sensitivity of Pa;c6 (Georgala et al, 2011}. 

We show that Paupar associates with approximately 2800 sites 
in the genome and, where tested, these regions operate as transcrip- 
tional regulatory elements whose activity can be modulated by 
Paupar transcript levels. Our data are consistent with a model in 
which Paupar is indirectly targeted to the genome through RNA- 
protein interactions with multiple different neural transcription 
factors including PAX6. In accordance with this, we discovered a 
motif resembling a known PAX6 DNA binding motif within approxi- 
mately 9% of the 500 top-scoring Paupar bound sequences and 
showed that Paupar and PAX6 co-occupy specific genomic sites 
within the regulatory regions of genes whose expression change 
upon both Pax6 and Paupar knockdown. Furthermore, our data 
show that Paupar does not affect PAX6 chromatin occupancy and 
suggest that Paupar may regulate the association of PAX6 with its 
transcriptional cofactors to control target gene expression. 

It is likely that other nuclear enriched IncRNAs operate in a simi- 
lar manner to Paupar. The CTBPl-AS IncRNA has recently been 
demonstrated to possess cis- as well as trans-acting functions 
(Takayama et al, 2013} while Prncrl and Pcgeml have been shown 
to bind the AR and associate with androgen responsive enhancers 
genome-wide (Yang et al, 2013}. Hotair and Terc occupy hundreds 
of short genomic regions of up to 1 kb in length across multiple 
chromosomes while Drosophila roX2 interacts with Chromosome 
Entry Sites on the X-chromosome (Chu et al, 2011; Simon et al, 
2011}, regions that can recruit the dosage compensation machinery 
when inserted into autosomes (Fagegaltier & Baker, 2004}. Human 
Alu RNA, transcribed from short interspersed elements, binds poly- 
merase II and therefore has the potential to function as a general 
transcriptional repressor (Mariner et al, 2008} while the lateral 
mesoderm specific IncRNA Fendrr interacts in vitro with Foxfl 
promoter fragments in trans (Grote et al, 2013}. We therefore 
propose that Paupar is a member of a class of nuclear enriched 
IncRNAs that can interact with multiple transcriptional enhancers 
and silencers to regulate gene expression in trans in a transcript- 
dependent manner. 

LncRNAs that play important roles in the development and func- 
tion of the nervous system may be dysregulated in neurological dis- 
orders. We have shown that Paupar loss of function disrupts the 
normal cell cycle profile of N2A neuroblastoma cells and induces 
neuronal differentiation. Furthermore, its transcript binds and mod- 
ulates the activity of Sox2, Hesl and £2/2 gene transcriptional regu- 
latory elements in trans, in addition to regulating the expression of 
Paupar's adjacent Pax6 gene. Although the roles of Sox2 and Hesl 
in neural progenitor cell maintenance and neurogenesis are well 
characterised (Pevny & NicoUs, 2010; Kageyama et al, 2008}, the 
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function of the £2/2 gene in neural lineages has been less well 
studied. We found that E2f2, and the related E2f7 and E2f8 genes, 
are up-regulated upon Paupar knockdown in our profiling experi- 
ments and that the Paupar transcript operates to restrict the activity 
of an £2/2 enhancer element. £2/2 functions as a pro-survival factor 
during retinal development (Chen et al, 2009) and is involved in 
maintaining the differentiated state of post-mitotic neurons in cells 
in culture (Persengiev et al, 2001). Furthermore, £2/2 over-expres- 
sion promotes cell cycle arrest and inhibits S-phase progression in 
PC12 neurons (Persengiev et al, 2001). Up-regulation of £2/" family 
members in Paupar knockdown N2A cells may thus contribute to 
the accumulation of cells in S and G2 phases of the cell cycle and 
the observed neural differentiation phenotype. 

The mechanisms of action of a growing number of cis-acting 
IncRNAs have been studied and have revealed roles for both the 
RNA molecule and IncRNA transcription in regulating the expres- 
sion of genomically neighbouring genes (Lai et al, 2013; Melo 
et al, 2013; Wang et al, 2011). By way of contrast, we have 
uncovered the mode of action of a IncRNA molecule that is able 
to regulate the expression of genomically local as well as distal 
genes and suggest that IncRNAs which modulate cellular functions 
via genome-wide targets may be more widespread than previously 
anticipated. 

Materials and Methods 

Plasmid construction 

To generate shRNA expression plasmids we first used the White- 
head Institute siRNA selection program to design shRNAs targeting 
multiple regions of Paupar and Pax6. Selected sequences were 
filtered to eliminate off-target effects by performing a BLAST search 
of the NCBI RefSeq database and removing hits with >15 matched 
bases of the anti-sense strand. Double stranded DNA oligonucleo- 
tides containing sense-loop-antisense targeting sequences were then 
cloned downstream of the U6 promoter in pBS-U5-CMVeGFP 
(Sarker et al, 2005) by linker ligation. To generate the Paupar 
expression plasmid the full length Paupar transcript was PCR ampli- 
fied as a Xhol fragment from mouse N2A cell cDNA and inserted 
into pCAGGS. Paupar CHART-Seq peak regions were PCR cloned 
from N2A genomic DNA and inserted upstream of the SV40 pro- 
moter in pGL3-Pro (Promega) to generate a panel of luciferase 
reporters to test for transcriptional regulatory activity. The 
sequences of all oligonucleotides used for cloning are shown in 
Supplementary Table 1 . 

Cell culture 

N2A mouse neuroblastoma cells were grown in DMEM supple- 
mented with 10% fetal bovine serum. E14 mouse ES cells were 
grown on 0.1% gelatin-coated dishes in DMEM supplemented with 
15% fetal calf serum, Leukemia-Inhibitor factor, fx non-essential 
amino acids, 2 mM L-glutamine, 50 mg/ml penicillin/streptomycin 
and 50 |.iM 2-mercaptoethanol. To induce neuronal differentiation 
ES cells were seeded onto gelatinised plates and grown in differ- 
entiation medium (DMEM supplemented with 10% fetal calf 
serum, Ix non-essential amino acids, 2 mM L-glutamine, 50 mg/ml 



penicillin/streptomycin, 50 |iM 2-mercaptoethanol and 10 M all- 
trans retinoic acid). 

Transcriptomic analysis 

Total RNA was isolated using the Qiagen Mini RNeasy kit following 
the manufacturer's instructions and RNA integrity assessed on a 
BioAnalyzer (Agilent Technologies). 200 ng RNA was used to gener- 
ate labelled sense single stranded DNA (ssDNA) for hybridization 
with the Ambion WT Expression Kit, the Affymetrix WT Terminal 
Labelling and Controls Kit and the Affymetrix Hybridization, Wash, 
and Stain Kit as described by the manufacturer. Sense ssDNA was 
fragmented and the distribution of fragment lengths was measured 
on a BioAnalyzer. Fragmented ssDNA was then labelled and hybrid- 
ized to the Affymetrix GeneChip Mouse Gene 1.0 ST Array (Affyme- 
trix). Chips were processed on an Affymetrix GeneChip Fluidics 
Station 450 and Scanner 3000. Affymetrix GEL files were analysed 
using the Limma, oligo, and genefilter R Bioconductor packages 
(Carvalho & Irizarry, 2010; Smyth, 2004). Arrays were RMA back- 
ground corrected, quantile normalised and summary expression 
values calculated for Refseq and full length mRNAs. Genes changing 
upon Paupar knockdown were filtered to remove genes showing 
little variation in expression (variance cut off of 0.5), while for the 
Pax6 knockdown analysis, genes with consistently low expression 
were removed before the identification of significant changes. In 
each case, differential expression between three knockdown and 
three control samples (biological replicates) was tested using the 
Limma Ebayes algorithm. Gene Ontology analyses were performed 
using GOToolBox, and representative significantly enriched catego- 
ries selected from a hypergeometric test with a Benjamini-Hochberg 
corrected P-value threshold of 0.05 (http://genome.crg.es/GOTool- 
Box/). 

CHART-seq and analysis 

CHART Enrichment and RNase H Mapping experiments were per- 
formed as previously described (Simon, 2013). CHART extract 
was prepared from approximately 8 x lO'' N2A cells per pull down 
and hybridized with 810 pmol biotinylated oligonucleotide cocktail 
(Supplementary Table 1) overnight with rotation at room tempera- 
ture. Complexes were captured using 250 \A MyOneCl streptavidin 
beads (Invitrogen) overnight at room temperature with rotation. 
Bound material was extensively washed and eluted using RNase H 
(New England Biolabs) for 30 min at room temperature. Samples 
were treated with Proteinase K and cross-links were reversed. 
RNA was purified from 1/5 total sample volume using the Qiagen 
miRNeasy kit while DNA was purified from the remaining sample 
by Phenol:CHCl3:isoamyl extraction and ethanol precipitation. 
DNA was sheared to an average fragment size of 150-300 bp 
using a Bioruptor (Diagenode) and sequenced on an lUumina 
HiSeq. 

CHART-seq was performed in replicate with two independent 
pull down samples and matched controls using non-targeting LacZ 
oUgos. A single sample of input DNA from N2A cells was prepared 
and sequenced separately. 50 bp, paired-end reads were mapped to 
the mouse genome (mm9) using bowtie with the options '-ml -v2 - 
best -strata -a'. For each Paupar sample, peaks were called against 
the matched LacZ control and against the N2A input sample. Peak 



© 2014 The Authors 



The EMBO Journal Vol 33 | No 4 | 2014 307 



The EMBO Journal 



Paupar, a trans-acting IncRNA Keith W Vance et al 



calls weie made using the MACS2 algorithm (Zhang et al, 2008; 
https://github.com/taoliu/MACS/blob/master/README) with the 
options '-mfold 10 30 -gsize=2.39e9 -qvalue^O.Ol' using the CGAT 
pipeline 'pipeline_mapping.py' (https://github.com/CGATOxford/ 
cgat). Peak calls were then filtered such that only peak calls with a 
-loglO q value > 5 were retained (FDR 0.001%]. 

Characterisation of Paupar binding sites 

The chromosomal distribution of Paupar peaks was visualised 
using the R Bioconductor package 'ggbio' (Yin et al, 2012]. Gen- 
ome territory enrichments were identified using the Genome Asso- 
ciation Tester (GAT; Heger et al, 2013], using a mappability 
filtered workspace, an isochore file partitioning the genome into 
8 bins based on regional GC content and 10,000 simulations. 
Chromosomal enrichments were analysed by proportionally 
assigning chromosomal territories to a single virtual meta-chromo- 
some before using GAT to test for GC and mappability corrected 
enrichments as before. Peak shapes were visualised using read 
count normalised (MACS2-SPMR], background subtracted 
(MACS2-bdgcmp] coverage tracks from which regions covering 
peaks were extracted and centred based on the location of the 
peak maximum. Gene ontology categories enriched for Paupar 
binding were identified by intersecting regulatory regions for 
known coding genes with Paupar binding sites. Regulatory regions 
for genes were defined as a basal domain surrounding the TSS of 
—5 kb to +1 kb plus an extended domain of upstream and down- 
stream to the nearest gene's basal domain or to a maximum dis- 
tance of 1 Mb, following the GREAT definition (McLean et al, 
2010]. Enrichments were identified with GAT using the regulatory 
regions of all genes as the workspace, and 10,000 simulations. 
Because we noted some correspondence between Paupar binding 
and gene expression level, we supplied GAT with a file stratifying 
the workspace into six bins based on gene expression level in 
N2A cells under the '-isochore' option to conservatively avoid 
associations solely due to expression level. 

Paupar peaks were characterised using DNase I hypersensitiv- 
ity sites identified by the Stamatoyannopoulos lab at the Uni- 
versity of Washington and regulatory elements identified by the 
Ren lab at the Ludwig Institute for Cancer Research (ENCODE 
Project Consortium, 2011]. Enrichments of DNase I HS and regu- 
latory elements overlapping Paupar peaks were assessed using 
GAT to control for mappability and regional GC content as 
before. 

Complementarity of Paupar sequence and binding locations was 
assessed using the EMBOSS Water algorithm (Rice et al, 2000] to 
perform Smith-Waterman alignment with a range of gap opening 
and extension penalties. De novo motif discovery was performed 
using the MEME-ChIP (Machanick & Bailey, 2011] algorithm to 
examine the unmasked DNA sequence of the central regions of top 
scoring (MACS2 peak score] peak locations. MEME-ChIP was run 
with the options '-meme-mod zoops -meme-minw 5 -meme-maxw 
30 — meme-nmotifs 50' using a custom background file prepared 
from regions flanking the peak locations using the command 'fasta- 
get-markov -m 2'. Enrichment of known vertebrate transcription 
factor binding sites from the TRANSFAC Professional database 
(Matys et al, 2006] was assessed using the AME algorithm 
(McLeay & Bailey, 2010] with the options '-method fisher -length- 



correct' using the sequence and background file prepared for 
MEME-ChIP analysis. 

Paupar l<nocl<down and flow cytometry 

Approximately 2 x 10^ cells were plated per well in a six well 
plate. 16-24 h later cells were transfected with 1.5 ng shRNA 
expression construct using FuGENE 6 (Promega] according to the 
manufacturer's instructions. Total RNA was extracted from the 
cells 2-3 days later using the Qiagen Mini RNeasy kit according to 
the manufacturer's instructions. For stable transfections, N2A cells 
were co-transfected with a 5:1 ratio of pBSU6-sh408 expression 
vector and pTK-Hyg (Clontech]. Three days after transfection 
200 ng/ml Hygromycin B was added to the cells and individual 
drug resistant clones were isolated and expanded under selection 
conditions. Individual clones were characterized for Paupar 
expression using qRT-PCR. For flow cytometry, cells were har- 
vested by trypsinization, washed twice with PBS and fixed as a 
single cell suspension in — 20°C filtered 70% ethanol. After 
incubation at 4°C for 10 min cells were pelleted, treated with 
40 i-ig/ml RNase A and propidium iodide (40 ng/ml] for 30 min at 
room temperature and then analysed using a FACSCalibur (BD 
Biosciences] flow cytometer. 

qRT-PCR and RACE 

The QuantiTect Reverse Transcription Kit (Qiagen] was used for 
reverse transcription and followed by SYBR Green quantitative 
PGR using a Step One Plus Real-Time PGR System (Applied 
Biosystems]. RACE was performed using the GeneRacer Kit 
(Invitrogen] following the manufacturer's instructions. Human 
foetal brain RNA was obtained from Promega. Primers are shown 
in Supplementary Table 1. 

Cell fractionation 

Approximately 2.5 x lO'' cells were pelleted, washed with PBS, 
resuspended in 250 |il Lysis Buffer (15 mM HEPES pH7.5, 10 mM 
KCl, 5 mM MgCl2, 0.1 mM EDTA, 0.5 mM EGTA, 250 mM 
Sucrose, 0.4% Igepal, 1 mM DTT, 40 U/ml RNaseOUT (Invitro- 
gen], protease inhibitor cocktail [Roche]] and incubated on ice for 
20 min. Nuclei were centrifuged at 2,000 g for 10 min at 4°C and 
the supernatant was collected as the cytoplasmic fraction. Nuclei 
were then resuspended in 50 |il Nuclei Lysis Buffer (10 mM HE- 
PES pH7.5, 0.1 mM EDTA, 0.1 mM EGTA, 1 mM DTT, 40 U/ml 
RNaseOUT (Invitrogen], protease inhibitor cocktail [Roche]] and 
incubated on ice for 5 min. Nuclei were pelleted at 17,000 g for 
5 min at 4°C and the supernatant was removed as the nucleo- 
plasm fraction. The pellet was then resuspended in 50 \xl Salt 
Extraction Buffer (25 mM HEPES pH7.5, 10% glycerol, 420 mM 
NaCL, 5 mM MgCl2, 0.1 mM EDTA, 1 mM DTT, 40 U/ml 
RNaseOUT (Invitrogen], protease inhibitor cocktail [Roche]] and 
incubated for 30 min at 4°C with rotation. The sample was then 
centrifuged at 17,000 g for 20 min at 4°C. The supernatant was 
collected as the salt extracted fraction and the pellet resuspended 
in 50 1^1 Salt Extraction Buffer to generate the chromatin fraction. 
RNA was isolated from each fraction using the Qiagen Mini 
RNeasy kit following the manufacturer's instructions. 
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UV-RIP 

UV-RIP was performed as described in (Zhao et al, 2010) witli 
minor modifications. Approximately 1 x lO' N2A cells per UV-RIP 
were UV crosslinked in ice-cold PBS at 254 nm, 120 mjoules/cm^ 
using a Stratalinker (Stratagene) . Nuclei were isolated, disrupted by 
sonication (three cycles, 30 sec ON/OFF) using a Bioruptor (Diagen- 
ode] and treated with 20 |.d Turbo DNase (Ambion) before over- 
night incubation with either anti-rabbit PAX6 (AB2237) or rabbit 
IgG (both Millipore] polyclonal antibodies. Complexes were 
captured using Protein-A magnetic beads (Pierce), washed using 
low- and high-stringency buffers and then treated with RNA grade 
Proteinase K (Invitrogen). RNA was extracted using Trizol (Invitrogen) 
and analysed by qRT-PCR. 

Chip 

Chip was performed using approximately 1 x lo'' N2A cells per 
assay. Cells were trypsinized, resuspended in 10 ml PBS containing 
1% final concentration formaldehyde and incubated for 10 min at 
room temperature with rotation. Cross-linking reactions were 
quenched with 0.125 M glycine for 5 min at room temperature and 
washed twice with ice-cold PBS. Nuclei were then isolated and chro- 
matin was sheared to approximately 500 bp using a Bioruptor 
(Diagenode). Cross-linked chromatin was immunoprecipitated using 
5 |.ig anti-rabbit PAX6 or anti-rabbit IgG control antibodies (both 
Millipore) overnight at 4°C. Complexes were collected using 
Protein-A magnetic beads (Pierce) pre-blocked with BSA (New Eng- 
land Biolabs) and transfer RNA (Roche), then washed and eluted. 
Cross-links were reversed at 65°C overnight and DNA was precipi- 
tated, treated with Proteinase K (Roche) and then purified using a 
PCR Purification Kit (Qiagen). 

Luciferase assays 

Approximately 5 x 10* N2A cells were seeded per well in a 12-well 
plate. On the next day, cells were transfected with the indicated 
ratios of reporter constructs and expression vectors using FuGENE 6 
(Promega) according to the manufacturer's instructions. The pRL-tk 
plasmid (Promega) was co-transfected into each well to normalize 
for transfection efficiency. The total amount of DNA was made up 
to 1 ng for each transfection by the addition of empty expression 
vector. Forty-eight hours after transfection lysates were prepared 
and assayed for firefly and renilla luciferase activity. 

Data deposition 

Microarray and CHART-Seq data have been deposited in the 
GEO database under accession number GSE52571 (http://www. 
ncbi.nlm.nih.gov/geo/query/acc.cgi?acc = GSE52571). 

Supplementary information for this article is available online: 
http://emboj.embopress.org. 
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